======================================================================
P O O
doc: Thu Apr 2 11:59:51 1992
dlm: Wed Jul 21 14:37:58 1993
(c) 1992 ant@ips.id.ethz.ch
======================================================================
This file describes some internals of the inetray package. Its name
derives from Principles of Operation.
Overview
--------
The program inetray is responsible for dispatching and scheduling the
rayshade requests. In the usual terminology it acts as the client,
requesting services from a number of remotely running servers. It does
that using SUN RPC. Rendering requests do not block; inetray therefore
also listens continuously on a socket for incoming results. Data
received from the workers is written to the output file whenever
possible.
The program rpc.inetrayd serves two purposes: it services a number of
rpc requests dealing with initialization and management, and whenever
it receives a rendering request, it spawns off a worker child and
continues to service rpc requests (a restricted number).
The worker then renders a part of a frame and directly contacts the
dispatcher to send it the result. This is done using an XDR/TCP
connection.
inetray.start is a simple RPC daemon servicing requests for starting the
rpc.inetrayd servers.
Rayshade Libraries
------------------
Inetray uses the standard Rayshade libraries (as used in version 4.0.6).
Great care has been taken to avoid having to change the libraries at
all. As long as the interface stays the same no change is required for
Inetray even if rayshade evolves.
At times this decision not to change the rayshade source led to
complicated and maybe clumsy solutions. But in the interest of
portability it has nevertheless been adhered to strictly.
There is one currently unsolved problem arising from this decision: it
seems that the random number generator used for textures behaves
differently on big- and little-endian machines. Therefore they don't
mix. A solution has been promised by Craig Kolb for rayshade 5, but so
far it isn't clear when that will be out.
Input
-----
rayshade accepts its input in various ways:
1) rayshade "filename" (input is file)
2) ... | rayshade (input is stdin/pipe)
3) rayshade < "filename" (input is stdin/file)
4) rayshade (input is stdin/keyboard)
Inetray from version 1.1.0 on is fully compatible with all of these
possibilities. It achieves this by buffering stdin in a live buffer
(see below). RSInitialize() is then called taking its stdin from this
buffer. Note that in case 1 the buffer is not needed and should indeed
disappear, because it competes with inetray for the keyboard. This
problem is solved in a very simple manner: on buffer startup, SIGINT is
set to kill the buffer; once the buffer encounters an eof (i.e. cases
2, 3 & 4, where eof is necessarily encountered before RSInitialize()
returns), SIGINT is ignored. Inetray (the parent) sends a SIGINT to the
buffer on return from RSInitialize(). If the buffer still has stdin
open (necessarily the keyboard) it is killed; otherwise it continues
running.
Case 1 was the only one allowed for Inetray up to version 1.0.1. It
requires the input file to exist on all worker-machines. To increase
the flexibility, the inetray workers try to have the ``same''
working-directory as the dispatcher (see section Pathnames below).
Note that even when stdin is used for input, the input can contain
references to other files to be read in, namely cpp #includes and
height-fields. Those files must be accessed in much the same way as the
files in case 1 (see above and section Pathnames). If no such files
are included, then no file has to be accessible on the worker
machines.
Case 4 is handled much like cases 2 & 3.
Live Buffers
------------
A live buffer is just a forked process which first reads from one
filedesc into memory (malloc'ed) and then writes the contents of the
buffer to another (in some cases two) filedesc before terminating.
End of input is detected when either an eof is reached or a \0 is read
as the last character of a read() syscall. This feature allows live
buffers to be used to read from TCP connections which should not be
closed.
Live buffers are more expensive than writing a temporary file for large
amounts of data, and they worsen the already problematic memory
situation, but they avoid having servers write files, which could pose
a security problem (see below).
Note that live buffers terminate automatically once their respective
parents have disappeared. This is due to the fact that eventually they
will encounter an eof on the input filedesc and start writing. They
always write to a pipe, so once the last reader of the pipe has died
they die on a SIGPIPE.
Authentication & Security
-------------------------
If the servers (rpc.inetrayd) are started as root, they try to change to
the user id supplied to them. This is usually the user id of the user
running the dispatcher (inetray). Any user, however, can set a different
user id for servers started by inetray.start. No server can run as root
(uid == 0).
If the uid is illegal on the server it exits with an error message in
the syslog.
No server ever produces an output file. This limits the security
concerns to changing the access times of files. Of course it is
possible that there are loopholes in this concept; I just haven't found
one yet.
If the server is not started as root, it will continue to run under the
uid it was started as. One has to make sure that this user has read
access to the files concerned.
The actual usernames under which the servers are running are displayed
by both inetray and inetray.ping.
Session Keys
------------
Whenever a started server receives the first request, a session key is
sent along with it. Once a session key is installed, only requests with
the same key are serviced. In practice this means that only the person
who issued an inetray call can kill the running servers and workers.
The key is stored in the file .inetray.key in the directory where
inetray was issued. Any existing file of that name is renamed to
.inetray.key.old.
inetray displays the current session key on startup.
inetray.ping uses the special key 0. Therefore, if servers hang after
an inetray.ping, they can be killed with inetray.kill 0.
The program inetray.kill needs to be supplied with a session key. If
one is given as an argument, it takes precedence. If no key is
supplied, inetray.kill looks for one in the file .inetray.key.
Version Numbers
---------------
Both inetray and rpc.inetrayd know their version number. The server
passes it back to inetray and inetray.ping upon reception of the first
request. The worker is accepted only if the first character (i.e. the
major version number) of this version number matches.
Pathnames
---------
Since servers can be running on machines with totally different
filesystems but may need to access the input files locally, some
pathname substitution is supported.
All filenames are transferred as-is to the servers/workers. If they
start with a / they are absolute pathnames starting at the respective
root of each machine. This will probably not work well on all but the
most homogeneous networks.
If they don't start with a / they are relative names starting in the
current working directory. From the working directory where the client
is started the home-part is stripped if possible. This stripped
directory is then sent to the server which in turn adds the home
directory of the uid it is to run as.
Note that if nothing was stripped on the client-side, then nothing is
added on the server-side. Note also that the right directory is chosen
even when the server cannot run under that user id.
The server tries to chdir to the directory so constructed. If that
fails it continues to run in the current directory which is the
directory where it was started from. The working directories of the
servers are displayed by both inetray and inetray.ping.
The practical upshot is that if you have the same sub-directory
structure below your home on the different machines, you can start
Inetray in any of these directories and the servers/workers will chdir
to the right sub-dirs as well.
Port Numbers
------------
The rendered portions of a frame are sent back using an XDR/TCP
connection. The port number for this is defined in config.h
(RESULTPORT) but can be overridden for each user in the .inetrayrc
file.
Registering Servers
-------------------
Whenever inetray or inetray.ping are started, they try to register ready
servers.
First, the servers managed by inetray.start are launched; the servers
handled by inetd are started automatically when an INIT-request
arrives.
The order in which the machines are contacted is the following:
1: All simple hosts given in the Use List (if any)
2: All directed broadcasts addresses in the Use List (if any)
3: The Local Network (if option N=0 is not set in the Use List)
After starting, an INIT-request is sent to all machines, in the same
order. Servers that are to be started by inetd start automatically
when they receive an INIT-request.
Servers reply by opening a TCP-connection on the result-port and sending
back status info.
Answers may be ignored for two reasons: either the hostname appears
(exactly as given) in the ignore list in the current .inetrayrc, or the
major version number of the server does not match that of the
dispatcher.
If the input comes from stdin, then the contents of the dispatcher's
live buffer (see above) are sent to live buffers on the server machines
using the same TCP-connection. This is, however, only done once
registering is otherwise completed (i.e. the list of registered
machines is complete).
Work Scheduling
---------------
A frame is divided into blocks encompassing one or more lines. This is
done according to a simple heuristic whose parameters can be controlled
by editing config.h and/or overriding those values in a .inetrayrc file
(see INSTALL/Appendix B for details).
After n workers have been registered, the block size is calculated as
follows: blockSize = ySize / blocksPerServer / n. After that, the size
is checked against the lower and upper limits (MINBLOCKSIZE and
MAXBLOCKSIZE respectively). If it exceeds a limit, it is adjusted
accordingly. Finally, the size of the last, possibly incomplete, block
is calculated and the information printed.
In early versions (up to [0.2.0]), simple round-robin scheduling was
used: subsequent machines got subsequent blocks to trace; whenever the
end of a frame was reached, the whole process started over with only
the non-terminated blocks.
This could lead to quite bad behaviour towards the end. Consider for
example the file mole.ray. Early blocks (bottom half) take much longer
to trace than later ones. If one machine is heavily loaded, it won't
ever complete its block. This means that one early block will be
outstanding for a very long time, which will inhibit concurrent
writing. Furthermore, with a little bit of bad luck, this block will be
the last one outstanding, which means that a lot of machines will
calculate just one block in the end; this block will take a long time
to complete.
Starting with version [0.2.1], a rescheduling step is inserted in the
middle of a frame. The number of machines which have not yet returned a
result is counted, and the first n blocks (n being the number of those
machines) not yet calculated are given priority over other blocks.
These blocks are exactly those residing on the slow machines; this way
they are hopefully redistributed to faster machines.
In my setting, this modification led to quite a decrease in the time
needed to complete the last block.
Notes: - The scheme presented here also works nicely if workers crash
during the first half of a frame (which they seem to tend to
do).
For version 2.0.0 the scheduling has changed yet again. For images where
all the hard work is done in a small part of the picture the old
scheduler didn't work very nicely. To solve this problem the following
scheduler has been implemented:
- During the first round of work scheduling (i.e. until all
blocks have been dispatched once) the blocks are always
scheduled in pairs (i.e. one worker renders 2 blocks on every
request).
- When this 1st pass has finished, only single blocks are
dispatched.
It's not clear if this scheduler is always better than the earlier
versions.
Concurrent Servers & RPC Program Numbers
----------------------------------------
It is possible for one machine to have more than one server (and
worker) running at a time. This feature is implemented to allow
multiprocessor machines to have as many workers as processors. A
machine running more than one server cannot start them using inetd.
Concurrent servers have different RPC Program Numbers. The first server
gets the program number IRNUM defined in prognum.h. Subsequent servers
get subsequent program numbers.
This way, registering with the portmapper works correctly. It must be
noted, though, that all broadcasts to servers must now be sent for all
program numbers.
Error Logging
-------------
The general mechanism is described in README and SUPPORT.
Please note that all errors produced by the rayshade routines are also
logged. This is done using a funny redirection of stderr to the syslog
using socketpairs and async I/O. For this to work under AUX I had to
implement the socketpair() syscall there, since the built-in one does
not work (at least in our version).
Error Termination
-----------------
Roughly once every minute, every server checks if the dispatcher is
still running. If that's not the case, it kills its associated worker
(if it has one) and then exits with an entry in the syslog.
As of version 2.0.0 the server also checks the exit status of its child
once every minute. If the child exited with a status != 0, it shuts
itself down. This non-zero exit status can be due to two different
reasons: either the rayshade libraries exited explicitly, or the worker
was terminated by a signal (either implicitly (bus error, segmentation
violation, ...) or explicitly (it annoyed either your sysadmin or
yourself)).
Socket State
------------
It seems to me there's no clean way to extract the correct state of a
socket without reading kernel memory. Nevertheless, the connection
state must be retrieved to check the state of the dispatcher. In a
first test getpeername() was used. Unfortunately it returns the peer
name of the dispatcher even after that one has been killed (and the
socket is in CLOSE_WAIT/FIN_WAIT_2 state).
Up to version 1.0.1 select()'ing the socket for reading did the trick
since it was used only as a one-way server->dispatcher connection. Thus
being ready for reading meant an error.
Later versions use the TCP connection to send the stdin (see Live
Buffers above). Therefore checking the state of the connection means
selecting it for read and testing it for emptiness at the same time.
There's no UNIX syscall to do this. If it could be guaranteed that
nobody reads the socket between selecting it for read and testing it
for emptiness, then this would work. Unfortunately there is a live
buffer which reads the data written to the socket; this buffer is a
separate process which does not synchronize itself with the server.
The buffer cannot, however, block forever in its reading state. It will
stop reading once the buffer on the dispatcher side is exhausted or
killed. After that it will start writing on the pipe. Therefore we can
disallow checking the dispatcher for liveness while the server buffer
is in its reading state. It enters the reading state immediately when
lPostBuffer() is called. By selecting the pipe for reading we can find
out when the buffer is in its writing state.
Note that the socket is never written to (by the dispatcher) unless an
INIT request has been successfully completed by the server. Therefore
we don't even have to check the socket for emptiness: selecting it for
read, whenever we can guarantee that the buffer is not reading it,
tells us whether the dispatcher is still running.
If the buffer on the server side is killed before completing its
reading, then the server also terminates, assuming the death of the
dispatcher. This is ok. If the buffer dies during its writing period,
the pipe is closed and returns eof to the reader, which results in an
error and exit there.
Rpc.inetrayd startup
--------------------
The server can be started up by inetd or inetray.start (or, for
debugging purposes, by hand). It checks its number of arguments to
decide how it was started up. If it is called without any arguments, it
assumes that it is started by inetd. Therefore you have to supply a
dummy argument if you want to start it by hand.